Website Overview Presentation

12/10/25

Guy Fuchs

My Data Science Portfolio

  • This presentation walks through the website I’ve built this semester
  • The main goal is to provide a single place to view several data projects
  • I will briefly go over:
    1. Premier League Data Visualization
    2. Olympics Data Visualization
    3. Kidz Bop Censored Lyrics Analysis
    4. SQL Traffic Stop Analysis

Premier League Data Visualization

Question

  • How do teams differ in average goals scored at home in the 2021-22 season?

What I did

  • Used match-level data from the Premier League TidyTuesday dataset
  • Grouped by home team and calculated average home goals
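
The group-and-average step above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the site: the rows are made-up toy data, and the field names `home_team` and `home_goals` are assumptions about how the match-level data might be laid out.

```python
from collections import defaultdict

# Toy match-level rows (made up). Field names are assumed, not taken
# from the actual dataset.
matches = [
    {"home_team": "Team A", "home_goals": 3},
    {"home_team": "Team A", "home_goals": 1},
    {"home_team": "Team B", "home_goals": 0},
    {"home_team": "Team B", "home_goals": 2},
    {"home_team": "Team B", "home_goals": 1},
]


def average_home_goals(rows):
    """Return {team: mean goals scored per home match}."""
    totals = defaultdict(int)  # goals scored at home, per team
    games = defaultdict(int)   # home matches played, per team
    for row in rows:
        totals[row["home_team"]] += row["home_goals"]
        games[row["home_team"]] += 1
    return {team: totals[team] / games[team] for team in totals}


averages = average_home_goals(matches)

# Sort from highest to lowest average, the order a ranked bar chart would use.
ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

Swapping in another league or season would only require different input rows, since the grouping logic does not depend on the teams involved.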

Premier League Takeaways

  • Manchester City and Liverpool had the highest average home goals per match
  • Simple summaries already tell a useful story about performance
  • The same approach could be reused for any league or season

Olympics Data Visualization

Question

  • How do medal counts change over the years and what factors may affect this?

What I did

  • Used Olympic medal data from the Olympics TidyTuesday dataset
  • Wrangled the data to see how the total number of medals changed over time
  • Created a plot to spot long-term trends and interruptions in the Games
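
A rough sketch of the wrangling step, assuming one row per medal with a `year` field (an assumed layout; the medal counts below are toy values). The helper `find_gaps` is hypothetical and assumes a single four-year cycle, so it would need adjusting for data that mixes Summer and Winter Games.

```python
from collections import Counter

# Toy medal records, one row per medal awarded. Counts are made up and
# the field names are assumptions.
medals = [
    {"year": 1908, "medal": "Gold"},
    {"year": 1912, "medal": "Gold"},
    {"year": 1912, "medal": "Silver"},
    {"year": 1920, "medal": "Gold"},
    {"year": 1920, "medal": "Silver"},
    {"year": 1920, "medal": "Bronze"},
]


def medals_per_year(rows):
    """Count medal rows per year, returned in chronological order."""
    counts = Counter(row["year"] for row in rows)
    return dict(sorted(counts.items()))


def find_gaps(years, expected_step=4):
    """Return (previous, next) year pairs that are further apart than expected."""
    ordered = sorted(years)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > expected_step]


totals = medals_per_year(medals)  # data for the trend line
gaps = find_gaps(totals)          # interruptions to flag on the plot
```

On the toy rows, the only flagged interval is 1912 to 1920, which is the kind of break that would show up as missing data in a trend plot.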

Olympic Trend Plot

Olympics Takeaways

  • The overall trend climbs upward
  • Growth tied to bigger programs, more events, and wider participation across countries
    • More opportunities, especially in women’s competitions, add to the rise
  • Dips and gaps in the data line up with major world events that interrupted the Games
  • Gives a good sense of how the Games have scaled and evolved

Kidz Bop Censored Lyrics Analysis

What Kidz Bop Censors Most

  • Used the Kidz Bop censored-lyrics dataset from The Pudding to understand what kinds of content get changed most often
  • The edits reflect an effort to keep songs upbeat and accessible for younger listeners

Relationship Themes Over Time

  • Looked for lyrics containing relationship-related words, such as love, kiss, partner, girlfriend, boyfriend, and similar terms
  • Calculated the share of lines each year that included any of those terms
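
The share calculation above could look something like the following sketch. The lyric lines are invented, the field names `year` and `lyric` are assumptions, and the term list holds only the five words named on the slide rather than the full list used in the analysis. Whole-word matching is also an assumption: it keeps "gloves" from counting as "love", but it would miss variants such as "loves" or "kissed" unless they were added to the list.

```python
import re
from collections import defaultdict

# Only the terms named on the slide; the full list is not reproduced here.
RELATIONSHIP_TERMS = ["love", "kiss", "partner", "girlfriend", "boyfriend"]

# \b restricts matches to whole words, and IGNORECASE catches "Kiss" and "kiss".
PATTERN = re.compile(r"\b(" + "|".join(RELATIONSHIP_TERMS) + r")\b", re.IGNORECASE)

# Invented lyric lines; field names are assumed.
lines = [
    {"year": 2014, "lyric": "I love the way you dance"},
    {"year": 2014, "lyric": "Turn the music up"},
    {"year": 2014, "lyric": "Dancing all night long"},
    {"year": 2014, "lyric": "Jump around with me"},
    {"year": 2015, "lyric": "My girlfriend is calling"},
    {"year": 2015, "lyric": "Kiss me under the lights"},
    {"year": 2015, "lyric": "Warm gloves and mittens"},
    {"year": 2015, "lyric": "Hands up in the air"},
]


def share_by_year(rows):
    """Return {year: fraction of lines containing any relationship term}."""
    total = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        total[row["year"]] += 1
        if PATTERN.search(row["lyric"]):
            hits[row["year"]] += 1
    return {year: hits[year] / total[year] for year in sorted(total)}


shares = share_by_year(lines)
```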

Relationship Themes Plot

  • The peak around 2015 matches the period when pop music leaned heavily into romantic themes
  • Kidz Bop’s edits shift alongside those trends

SQL Traffic Stop Analysis

What I did

  • Used traffic stop data from the Stanford Open Policing Project
  • Focused on Long Beach (CA), Mesa (AZ), and San Jose (CA)
  • Used SQL to:
    • Compare pedestrian vs vehicular stops over time
    • Compare citation rates across race and city

Traffic and Pedestrian Stops Over Time

  • Unioned the three cities’ data so it could be grouped and filtered together
  • Counted stops by city, year, and stop type
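
The union-then-count approach can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions rather than the original query: the table names (`long_beach`, `mesa`, `san_jose`), the columns (`date`, `type`), and every row are hypothetical, and `substr` stands in for whatever year-extraction function the original SQL dialect provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical per-city tables with assumed column names; all rows are toy data.
for table in ("long_beach", "mesa", "san_jose"):
    cur.execute(f"CREATE TABLE {table} (date TEXT, type TEXT)")

cur.executemany("INSERT INTO long_beach VALUES (?, ?)", [
    ("2014-03-01", "vehicular"),
    ("2014-06-10", "vehicular"),
    ("2014-07-04", "pedestrian"),
])
cur.executemany("INSERT INTO mesa VALUES (?, ?)", [
    ("2015-01-15", "vehicular"),
    ("2015-02-20", "pedestrian"),
])
cur.executemany("INSERT INTO san_jose VALUES (?, ?)", [
    ("2014-09-09", "vehicular"),
    ("2015-11-30", "vehicular"),
])

# Stack the three tables, tag each row with its city, then count stops
# per city, year, and stop type.
QUERY = """
SELECT city,
       CAST(substr(date, 1, 4) AS INTEGER) AS year,
       type,
       COUNT(*) AS n_stops
FROM (
    SELECT 'Long Beach' AS city, date, type FROM long_beach
    UNION ALL
    SELECT 'Mesa' AS city, date, type FROM mesa
    UNION ALL
    SELECT 'San Jose' AS city, date, type FROM san_jose
) AS all_stops
WHERE type IN ('vehicular', 'pedestrian')
GROUP BY city, year, type
ORDER BY city, year, type
"""

rows = cur.execute(QUERY).fetchall()
conn.close()
```

`UNION ALL` is used here because plain `UNION` removes duplicate rows, which would undercount stops that share a date and type.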

What We Can Take From This

  • Vehicular stops dominate the totals in every city, though the overall level differs from city to city
  • Trends differ across cities
    • Each department operates under its own patterns and volumes of activity

Citation Rates by Race Across Cities

  • Used the same three city tables: Long Beach, Mesa, and San Jose
  • Restricted to years 2014-2016
  • Wrangled data to compute citation rate as citations divided by total stops
  • Question: do the citation rates mainly reflect differences across race, or differences in how each city records stop outcomes?
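
As a sketch of the citation-rate calculation (citations divided by total stops, restricted to 2014-2016), here is one way to write it with `sqlite3`. The combined `stops` table, its columns (`city`, `date`, `subject_race`, `citation_issued`), the placeholder city and group labels, and all rows are assumptions made for illustration, not the real data or schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical combined table; citation_issued is 1 if the stop led to a citation.
cur.execute(
    "CREATE TABLE stops (city TEXT, date TEXT, subject_race TEXT, citation_issued INTEGER)"
)
cur.executemany("INSERT INTO stops VALUES (?, ?, ?, ?)", [
    ("City X", "2014-05-01", "group_a", 1),
    ("City X", "2015-05-01", "group_a", 1),
    ("City X", "2016-05-01", "group_b", 1),
    ("City Y", "2014-05-01", "group_a", 1),
    ("City Y", "2015-05-01", "group_a", 0),
    ("City Y", "2015-08-01", "group_a", 0),
    ("City Y", "2016-05-01", "group_a", 0),
    ("City Y", "2016-06-01", "group_b", 0),
    ("City Y", "2016-07-01", "group_b", 1),
    ("City Y", "2017-07-01", "group_b", 1),  # outside the 2014-2016 window
])

# Citation rate = citations / total stops, per city and race.
# Multiplying by 1.0 forces floating-point division in SQLite.
QUERY = """
SELECT city,
       subject_race,
       COUNT(*) AS total_stops,
       SUM(citation_issued) AS citations,
       1.0 * SUM(citation_issued) / COUNT(*) AS citation_rate
FROM stops
WHERE CAST(substr(date, 1, 4) AS INTEGER) BETWEEN 2014 AND 2016
GROUP BY city, subject_race
ORDER BY city, subject_race
"""

rows = cur.execute(QUERY).fetchall()
conn.close()
```

In the toy data, "City X" has a rate of 1.0 for every group. A pattern like that can hint that only cited stops were recorded, so it is worth checking before reading the rates as differences across race.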

Citation Rates Plot

  • Compares citation rates by race for Long Beach, Mesa, and San Jose between 2014 and 2016

What the Patterns Suggest

  • The Long Beach and Mesa datasets may include only stops that led to a citation, or non-citation outcomes may have been recorded inconsistently or not at all
  • San Jose has much lower citation rates at around a quarter of stops
    • Citation rates are driven not only by variation across cities, but also by differences in how the underlying agencies record their stops

Final Takeaways

  • Small, focused analyses can reveal clear differences across cities and policing practices
  • Public datasets like these make it possible to spot patterns that would otherwise stay hidden
  • Simple analysis and data visualization can provide meaningful insights

References